History of Astronomy: Part 2
Previously, we discussed the history of astronomy, focusing on developments prior to Newtonian mechanics. In this section, we will continue our exploration of the history of astronomy, focusing on developments from the time of Newton to the present day.
Newton's Work
Isaac Newton (1643–1727) built upon the work of Galileo and others to develop a comprehensive theory of motion and gravitation. When the plague closed the University of Cambridge in 1665, Newton retreated to his family estate in Woolsthorpe, where he made several groundbreaking discoveries. His Philosophiae Naturalis Principia Mathematica (Mathematical Principles of Natural Philosophy), published in 1687, laid the foundation for classical mechanics. In the Principia, Newton formulated his three laws of motion and the law of universal gravitation. He also developed differential and integral calculus (independently developed by Leibniz), although the Principia itself argues largely in geometric terms. He published his work on optics in Opticks in 1704, where he described the nature of light and color.
Newton was one of the most influential scientists in history. When Johann Bernoulli posed the brachistochrone problem as a public challenge in 1696, Newton reportedly solved it in a single night, using methods that anticipated the calculus of variations. His solution was published anonymously in the Philosophical Transactions of the Royal Society in 1697. Bernoulli nevertheless recognized its author and famously remarked, "I recognize the lion by his claw."
Newton's law of gravitation, stated for point masses, also applies to extended bodies with spherical symmetry, such as planets: outside such a body, the gravitational field is the same as if all of its mass were concentrated at the center. This can be proven either by integrating over the volume of the sphere or by using Gauss's law for gravity. The latter provides a very elegant proof, but was not available until the work of Carl Friedrich Gauss in the 19th century.
We should be thoroughly familiar with Newton's laws of motion and gravitation, as they form the basis of classical mechanics. Nevertheless, it is worth briefly reviewing them here. We will do so by deriving Kepler's laws of planetary motion from Newton's law of gravitation. We must first establish the necessary mathematical tools.
Center-of-Mass Reference Frame
In a system of particles, the center of mass (COM) is the weighted average position of all the particles, where the weights are given by their masses.
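For a system of $N$ particles with masses $m_i$ and positions $\mathbf{r}_i$, the center of mass is located at
$$\mathbf{R} = \frac{\sum_{i} m_i \mathbf{r}_i}{\sum_{i} m_i} = \frac{1}{M}\sum_{i} m_i \mathbf{r}_i,$$
where $M = \sum_i m_i$ is the total mass of the system.
If we rearrange and differentiate this equation, we have
$$M\mathbf{V} = \sum_{i} m_i \mathbf{v}_i = \mathbf{P},$$
where $\mathbf{V} = d\mathbf{R}/dt$ is the velocity of the center of mass, $\mathbf{v}_i$ is the velocity of particle $i$, and $\mathbf{P}$ is the total momentum. If no external force acts on the system, $\mathbf{P}$ is constant, so the center of mass moves at constant velocity and we may choose an inertial frame in which $\mathbf{R} = 0$.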
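Anyways, suppose we have a binary system of two particles with masses $m_1$ and $m_2$ at positions $\mathbf{r}_1$ and $\mathbf{r}_2$ in the center-of-mass frame, so that $m_1\mathbf{r}_1 + m_2\mathbf{r}_2 = 0$. Let $\mathbf{r} = \mathbf{r}_2 - \mathbf{r}_1$ be the displacement from $m_1$ to $m_2$.
If we define the reduced mass
$$\mu = \frac{m_1 m_2}{m_1 + m_2},$$
then each particle's position is
$$\mathbf{r}_1 = -\frac{\mu}{m_1}\mathbf{r}, \qquad \mathbf{r}_2 = \frac{\mu}{m_2}\mathbf{r}.$$
There are a few useful identities that we can derive from this.
First, Newton's third law gives
$$m_2\ddot{\mathbf{r}}_2 = \mathbf{F} = -m_1\ddot{\mathbf{r}}_1,$$
where $\mathbf{F}$ is the gravitational force exerted on $m_2$ by $m_1$. Substituting the positions above into either side gives $\mu\ddot{\mathbf{r}} = \mathbf{F}$.
This means that we can treat the two-body problem as a one-body problem with mass $\mu$ moving about a fixed mass $M$:
$$\mu\ddot{\mathbf{r}} = -\frac{GM\mu}{r^2}\hat{\mathbf{r}},$$
where $M = m_1 + m_2$ is the total mass and we have used $m_1 m_2 = M\mu$.
The gravitational potential energy of the system is
$$U = -\frac{G m_1 m_2}{r} = -\frac{GM\mu}{r}.$$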
Kepler's First Law
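We shall use the center-of-mass frame, and begin by considering the torque on the system, that is, the time derivative of the orbital angular momentum $\mathbf{L} = \mathbf{r}\times\mathbf{p} = \mu\,\mathbf{r}\times\mathbf{v}$:
$$\frac{d\mathbf{L}}{dt} = \frac{d\mathbf{r}}{dt}\times\mathbf{p} + \mathbf{r}\times\frac{d\mathbf{p}}{dt} = \mathbf{v}\times\mathbf{p} + \mathbf{r}\times\mathbf{F}.$$
The second term is zero, since gravity is a central force and $\mathbf{F}$ is parallel to $\mathbf{r}$. The first term also vanishes because $\mathbf{p} = \mu\mathbf{v}$ is parallel to $\mathbf{v}$. The angular momentum is therefore conserved.
Anyways, we can also write the angular momentum as
$$\mathbf{L} = \mu r\hat{\mathbf{r}}\times\frac{d}{dt}\left(r\hat{\mathbf{r}}\right) = \mu r^2\,\hat{\mathbf{r}}\times\frac{d\hat{\mathbf{r}}}{dt}.$$
Next, consider the acceleration of the vector $\mathbf{r}$,
$$\mathbf{a} = -\frac{GM}{r^2}\hat{\mathbf{r}},$$
so the cross product of $\mathbf{a}$ and $\mathbf{L}$ is
$$\mathbf{a}\times\mathbf{L} = -GM\mu\,\hat{\mathbf{r}}\times\left(\hat{\mathbf{r}}\times\frac{d\hat{\mathbf{r}}}{dt}\right).$$
Using the vector triple product identity $\mathbf{A}\times(\mathbf{B}\times\mathbf{C}) = (\mathbf{A}\cdot\mathbf{C})\mathbf{B} - (\mathbf{A}\cdot\mathbf{B})\mathbf{C}$, together with $\hat{\mathbf{r}}\cdot d\hat{\mathbf{r}}/dt = 0$, this becomes
$$\mathbf{a}\times\mathbf{L} = GM\mu\frac{d\hat{\mathbf{r}}}{dt}.$$
We can rewrite the left-hand side as a total time derivative, because $\mathbf{L}$ is constant:
$$\frac{d}{dt}\left(\mathbf{v}\times\mathbf{L}\right) = \frac{d}{dt}\left(GM\mu\,\hat{\mathbf{r}}\right).$$
Integrating both sides with respect to time, we have
$$\mathbf{v}\times\mathbf{L} = GM\mu\,\hat{\mathbf{r}} + \mathbf{D},$$
with $\mathbf{D}$ a constant vector of integration.
It is easy to see that $\mathbf{D}$ lies in the orbital plane, since both $\mathbf{v}\times\mathbf{L}$ and $\hat{\mathbf{r}}$ are perpendicular to $\mathbf{L}$.
Taking the dot product of both sides with $\mathbf{r}$,
$$\mathbf{r}\cdot\left(\mathbf{v}\times\mathbf{L}\right) = GM\mu r + rD\cos\theta.$$
Using the scalar triple product identity $\mathbf{A}\cdot(\mathbf{B}\times\mathbf{C}) = (\mathbf{A}\times\mathbf{B})\cdot\mathbf{C}$, the left-hand side is $(\mathbf{r}\times\mathbf{v})\cdot\mathbf{L} = L^2/\mu$, so
$$\frac{L^2}{\mu} = GM\mu r\left(1 + \frac{D\cos\theta}{GM\mu}\right),$$
where $\theta$ is the angle between $\mathbf{r}$ and $\mathbf{D}$,
which rearranges to
$$r = \frac{L^2/\mu^2}{GM\left(1 + \dfrac{D}{GM\mu}\cos\theta\right)}.$$
This is the equation of a conic section in polar coordinates, with one focus at the origin.
If we define the eccentricity $e = D/(GM\mu)$, then for a bound orbit ($0 \le e < 1$) this matches the standard polar equation of an ellipse,
$$r = \frac{a(1-e^2)}{1 + e\cos\theta},$$
where $a$ is the semi-major axis.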
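Kepler's first law is thus proven: the bound orbit of a planet is an ellipse. One caveat is that the focus is at the center of mass, not at the Sun as Kepler assumed. Kepler could not have known this, for the center of mass lies so close to the Sun that it is nearly indistinguishable from it.
Anyways, we can equate the two expressions for $r$, the conic-section form $r = (L^2/\mu^2)/[GM(1 + e\cos\theta)]$ and the standard ellipse $r = a(1-e^2)/(1 + e\cos\theta)$, to obtain the orbital angular momentum:
$$L = \mu\sqrt{GMa(1-e^2)}.$$
When $e = 0$ (a circular orbit), $L$ takes its maximum value for a given $a$; as $e \to 1$, $L \to 0$.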
Kepler's Second Law
For the second law, we need to understand how to integrate in polar coordinates.
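The area element in polar coordinates is
$$dA = r\,dr\,d\theta.$$
Integrating from the focus out to the orbit at radius $r$, an infinitesimal change in area is thus
$$dA = \frac{1}{2}r^2\,d\theta,$$
and its time derivative is
$$\frac{dA}{dt} = \frac{1}{2}r^2\frac{d\theta}{dt}.$$
As the term $r\,d\theta/dt$ is the transverse component $v_\theta$ of the velocity, this is $dA/dt = \frac{1}{2}r v_\theta$.
Finally, by definition, $L = |\mu\,\mathbf{r}\times\mathbf{v}| = \mu r v_\theta$, so
$$\frac{dA}{dt} = \frac{L}{2\mu}.$$
Notice that $L$ and $\mu$ are both constants, so the line joining the two bodies sweeps out area at a constant rate. This is Kepler's second law.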
Kepler's Third Law
Historically speaking, Newton's law of gravitation was formulated to explain Kepler's third law.
We can now demonstrate that Newton's law of gravitation does indeed lead to Kepler's third law.
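Begin by integrating Kepler's second law, $dA/dt = L/(2\mu)$, over one full orbit (with period $P$):
$$A = \frac{L}{2\mu}P.$$
The area of an ellipse is $A = \pi ab$, where $a$ and $b$ are the semi-major and semi-minor axes, so
$$P = \frac{2\pi ab\mu}{L}.$$
From the equation of an ellipse, we have $b = a\sqrt{1-e^2}$, and the angular momentum of an elliptical orbit is $L = \mu\sqrt{GMa(1-e^2)}$ with $M = m_1 + m_2$, so
$$P = 2\pi\sqrt{\frac{a^3}{GM}}.$$
Finally, squaring both sides, we have
$$P^2 = \frac{4\pi^2}{G(m_1 + m_2)}a^3,$$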
which is Kepler's third law: the square of the orbital period is proportional to the cube of the semi-major axis. Notice that according to this formulation, the square of the period is also inversely proportional to the total mass of the system, which is an observation Kepler could not have made. He relied on Brahe's observations of planets orbiting the Sun, whose mass is so much greater than that of any planet that the total mass is effectively constant.
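As a quick numerical check of the third law, the sketch below evaluates the standard Newtonian form $P^2 = 4\pi^2 a^3 / [G(m_1 + m_2)]$ for the Earth-Sun system. The function name and the rounded constants are illustrative choices, not values taken from this text.

```python
import math

# Approximate physical constants (assumed, rounded values)
G = 6.674e-11        # gravitational constant, m^3 kg^-1 s^-2
M_SUN = 1.989e30     # mass of the Sun, kg
M_EARTH = 5.972e24   # mass of Earth, kg
AU = 1.496e11        # astronomical unit, m


def orbital_period(a, m1, m2):
    """Orbital period in seconds from P^2 = 4 pi^2 a^3 / (G (m1 + m2))."""
    return 2.0 * math.pi * math.sqrt(a**3 / (G * (m1 + m2)))


period_days = orbital_period(AU, M_SUN, M_EARTH) / 86400.0
print(f"Earth-Sun orbital period: {period_days:.1f} days")
```

Dropping `M_EARTH` from the sum changes the result only in the sixth significant figure, which illustrates why Kepler could not have detected the mass dependence from planetary data.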
Virial Theorem
One important result that we will need is the virial theorem.
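For a stable, bound system of particles interacting through a potential $U$, the virial theorem states that
$$2\langle K\rangle = n\langle U\rangle,$$
where $\langle K\rangle$ and $\langle U\rangle$ are the time-averaged total kinetic and potential energies, and $n$ is the degree of homogeneity of the potential.
A homogeneous function of degree $n$ is one that satisfies $f(\lambda x) = \lambda^n f(x)$ for any scaling factor $\lambda$.
We can choose a potential of the form $U \propto r^n$; gravity corresponds to $n = -1$, for which the theorem reads $-2\langle K\rangle = \langle U\rangle$. We now derive this gravitational case.
We will define a new term, the virial
$$Q = \sum_i \mathbf{p}_i\cdot\mathbf{r}_i,$$
where $\mathbf{p}_i$ and $\mathbf{r}_i$ are the momentum and position of particle $i$. Its time derivative is
$$\frac{dQ}{dt} = \sum_i\left(\frac{d\mathbf{p}_i}{dt}\cdot\mathbf{r}_i + \mathbf{p}_i\cdot\frac{d\mathbf{r}_i}{dt}\right) = \sum_i \mathbf{F}_i\cdot\mathbf{r}_i + 2K,$$
where $\mathbf{F}_i$ is the net force on particle $i$ and $K = \sum_i \frac{1}{2}m_i v_i^2$ is the total kinetic energy.
On the other hand, we can also write the time derivative of $Q$ as
$$\frac{dQ}{dt} = \frac{d}{dt}\sum_i m_i\frac{d\mathbf{r}_i}{dt}\cdot\mathbf{r}_i = \frac{d}{dt}\sum_i\frac{1}{2}\frac{d}{dt}\left(m_i r_i^2\right) = \frac{1}{2}\frac{d^2 I}{dt^2}.$$
Here, $I = \sum_i m_i r_i^2$ is the moment of inertia of the system. Equating the two expressions gives
$$\frac{1}{2}\frac{d^2 I}{dt^2} - 2K = \sum_i \mathbf{F}_i\cdot\mathbf{r}_i.$$
The term on the right is known as the virial of Clausius. Clausius (1822–1888) was a German physicist and mathematician who made significant contributions to the field of thermodynamics. He is best known for formulating the second law of thermodynamics and introducing the concept of entropy. He also made important contributions to the kinetic theory of gases and the study of heat transfer.
Next, we need to consider all the forces acting on each particle in the system.
Let $\mathbf{F}_{ij}$ be the force on particle $i$ due to particle $j$, so that $\mathbf{F}_i = \sum_{j\neq i}\mathbf{F}_{ij}$.
Substituting this into the previous equation, we have
$$\sum_i \mathbf{F}_i\cdot\mathbf{r}_i = \sum_i\sum_{j\neq i}\mathbf{F}_{ij}\cdot\mathbf{r}_i.$$
We also have the identity $\mathbf{r}_i = \frac{1}{2}(\mathbf{r}_i - \mathbf{r}_j) + \frac{1}{2}(\mathbf{r}_i + \mathbf{r}_j)$, which splits the sum into
$$\frac{1}{2}\sum_i\sum_{j\neq i}\mathbf{F}_{ij}\cdot(\mathbf{r}_i - \mathbf{r}_j) + \frac{1}{2}\sum_i\sum_{j\neq i}\mathbf{F}_{ij}\cdot(\mathbf{r}_i + \mathbf{r}_j).$$
For the rightmost term, due to Newton's third law, we have $\mathbf{F}_{ij} = -\mathbf{F}_{ji}$, so the contributions of each pair cancel and the term vanishes. For gravity, $\mathbf{F}_{ij} = G m_i m_j(\mathbf{r}_j - \mathbf{r}_i)/r_{ij}^3$,
where $r_{ij} = |\mathbf{r}_j - \mathbf{r}_i|$ is the separation of the pair, so the remaining term is
$$-\frac{1}{2}\sum_i\sum_{j\neq i}\frac{G m_i m_j}{r_{ij}} = U,$$
where $U$ is the total gravitational potential energy (the factor of $\frac{1}{2}$ corrects for counting each pair twice). Hence $\frac{1}{2}\,d^2I/dt^2 - 2K = U$.
We can take the time average of both sides over a long time period $\tau$. For a bound system in equilibrium, $dI/dt$ remains bounded, so $\langle d^2I/dt^2\rangle \to 0$ and
$$-2\langle K\rangle = \langle U\rangle.$$
This is the virial theorem, specifically for a system bound by an inverse-square force such as gravity.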
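The gravitational form of the theorem, $2\langle K\rangle + \langle U\rangle = 0$, can be checked numerically on a single Kepler orbit. The sketch below is only an illustration under stated assumptions: it uses two standard results not derived in this section, the parametrization $r = a(1 - e\cos E)$ with $dt \propto (1 - e\cos E)\,dE$ in terms of the eccentric anomaly $E$, and the vis-viva equation $v^2 = GM(2/r - 1/a)$. The function name and the unit choice $GM = \mu = a = 1$ are arbitrary.

```python
import math


def virial_averages(a=1.0, e=0.5, gm=1.0, mu=1.0, n=10_000):
    """Time-averaged kinetic and potential energy over one Kepler orbit."""
    k_sum = u_sum = t_sum = 0.0
    for i in range(n):
        ecc_anom = 2.0 * math.pi * (i + 0.5) / n   # eccentric anomaly E
        r = a * (1.0 - e * math.cos(ecc_anom))     # orbital separation
        dt = 1.0 - e * math.cos(ecc_anom)          # time weight, dt ~ (1 - e cos E) dE
        v2 = gm * (2.0 / r - 1.0 / a)              # vis-viva equation
        k_sum += 0.5 * mu * v2 * dt
        u_sum += -gm * mu / r * dt
        t_sum += dt
    return k_sum / t_sum, u_sum / t_sum


k_avg, u_avg = virial_averages()
print(f"<K> = {k_avg:.6f}, <U> = {u_avg:.6f}, 2<K> + <U> = {2 * k_avg + u_avg:.2e}")
```

Averaging over time rather than over angle matters here: the body spends longer near apoapsis, which is what the weight `dt` accounts for.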
Light and Optics
Another field that Newton made significant contributions to was optics. In his book Opticks, published in 1704, Newton described his experiments with light and color. He demonstrated that white light is composed of a spectrum of colors, which can be separated using a prism. He also proposed the particle theory of light, suggesting that light is made up of tiny particles called "corpuscles." This theory was later challenged by the wave theory of light, which was supported by experiments such as Thomas Young's double-slit experiment in 1801.
Previously, we discussed how Galileo used a refracting telescope to make astronomical observations. Over the years, the mathematical and physical understanding of optics improved, leading to the development of more advanced telescopes.
The focal plane of a lens or mirror is the plane where light rays converge to form an image.
In astronomy, the focal plane is where the image of a celestial object is formed by the telescope's optics.
We assume that celestial objects are located practically at infinity, so the light rays entering the telescope are parallel.
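If such rays approach a lens at an angle $\theta$ to the optical axis, they converge at a point in the focal plane a distance
$$y = f\tan\theta \approx f\theta$$
from the axis, where $f$ is the focal length and the approximation holds for small $\theta$ measured in radians.
This relationship is also captured by the plate scale, which is a differential relation given by
$$\frac{d\theta}{dy} = \frac{1}{f}.$$
What does this mean? If we increase the focal length $f$, a given angular separation on the sky maps to a larger linear separation in the focal plane, so the image is more spread out and finer detail can be recorded.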
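To make the plate scale concrete, here is a minimal sketch assuming the small-angle relation $d\theta/dy = 1/f$, converted to arcseconds per millimeter. The function names and the example focal lengths are illustrative only.

```python
import math

ARCSEC_PER_RADIAN = 180.0 * 3600.0 / math.pi  # about 206265


def plate_scale_arcsec_per_mm(focal_length_mm):
    """Plate scale d(theta)/dy = 1/f, expressed in arcseconds per millimeter."""
    return ARCSEC_PER_RADIAN / focal_length_mm


def image_separation_mm(theta_arcsec, focal_length_mm):
    """Linear separation in the focal plane of two sources theta_arcsec apart."""
    return theta_arcsec / plate_scale_arcsec_per_mm(focal_length_mm)


for f_mm in (1000.0, 2000.0, 10000.0):
    scale = plate_scale_arcsec_per_mm(f_mm)
    sep = image_separation_mm(10.0, f_mm)
    print(f"f = {f_mm:7.0f} mm: {scale:8.3f} arcsec/mm, 10 arcsec -> {sep:.4f} mm")
```

Doubling the focal length halves the plate scale and doubles the separation of the two images, which is the trade-off described above.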
Stellar Parallax and the Magnitude System
In their original forms, Kepler's laws of planetary motion were based on the relative sizes of the orbits, for their absolute sizes were not known. In 1761, however, the first successful measurement of the distance to Venus was made during a transit of Venus across the Sun (this is credited to Jean-Baptiste Chappe d'Auteroche). By observing the transit from different locations on Earth, astronomers were able to use parallax to calculate the distance to Venus and, consequently, the distance from Earth to the Sun (the astronomical unit, or AU). In 1838, Friedrich Bessel made the first successful measurement of stellar parallax, determining the distance to the star 61 Cygni. This parallax effect is extremely useful; our eyes use it to perceive depth, and astronomers use it to measure distances to nearby stars.
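If the distance from the Sun to a star is $d$ and the star's parallax angle is $p$, then
$$d = \frac{1\ \mathrm{AU}}{\tan p} \approx \frac{1}{p}\ \mathrm{AU}$$
from simple trigonometry, where $p$ is measured in radians and equals half the maximum annual shift in the star's apparent position as Earth moves across its orbit.
As one radian is $206{,}265''$, we can write
$$d \approx \frac{206{,}265}{p''}\ \mathrm{AU},$$
and we define the parsec (pc, parallax-second) as the numerator, $1\ \mathrm{pc} = 206{,}265\ \mathrm{AU} \approx 3.086\times10^{16}\ \mathrm{m}$, so that $d = 1/p''$ pc.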
As the distances to stars are so vast, the parallax angles are extremely small: even the nearest stars have parallaxes below one arcsecond. This is why it took so long to detect the effect. Space-based instruments have pushed the achievable precision much further. NASA's Space Interferometry Mission (SIM), which was cancelled in 2010, aimed to measure parallax angles with a precision of about 1 microarcsecond, which would have allowed distance measurements to stars up to 10,000 light-years away. ESA's Gaia mission, launched in 2013, has since measured parallaxes for more than a billion stars.
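The parallax-distance conversion is a one-liner, sketched here under the standard definition that a parallax of one arcsecond corresponds to one parsec. The function names and the sample parallax values are illustrative, not measurements quoted in this text.

```python
LY_PER_PARSEC = 3.2616  # light-years per parsec (approximate)


def distance_pc(parallax_arcsec):
    """Distance in parsecs from a parallax angle in arcseconds, d = 1/p."""
    if parallax_arcsec <= 0.0:
        raise ValueError("parallax must be positive")
    return 1.0 / parallax_arcsec


def distance_ly(parallax_arcsec):
    """The same distance expressed in light-years."""
    return distance_pc(parallax_arcsec) * LY_PER_PARSEC


for p in (0.5, 0.1, 0.001):
    print(f"p = {p:6.3f} arcsec -> {distance_pc(p):8.1f} pc = {distance_ly(p):9.1f} ly")
```

Because $d \propto 1/p$, a fixed measurement error in $p$ produces a relative distance error that grows with distance, which is why microarcsecond precision is needed for stars thousands of light-years away.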
Magnitude Scale
Other than distance, another important property of stars is their brightness.
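Hipparchus, whom we met earlier, developed the first known magnitude scale for classifying the brightness of stars.
It was quite primitive, using the naked eye to classify stars into six categories, with $m = 1$ for the brightest stars and $m = 6$ for the faintest visible ones. The scale has since been made quantitative.
For instance, in 1856, Norman Pogson proposed a logarithmic scale for magnitudes, where a difference of 5 magnitudes corresponds to a brightness ratio of 100.
In other words, a difference of 1 magnitude corresponds to a brightness ratio of $100^{1/5} \approx 2.512$.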
To further understand the magnitude scale, we need to understand how we measure brightness.
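Suppose we have a detector with an effective area $A$ pointed at a star. The radiant flux $F$ is the total energy, at all wavelengths, that crosses a unit area of the detector per unit time.
To get a more useful quantity, pretend that we have completely surrounded the star with a sphere of radius $r$. All of the energy the star emits per unit time, its luminosity $L$, must pass through this sphere, so
$$F = \frac{L}{4\pi r^2}.$$
Notice that the flux depends only on the distance $r$, not on the shape or area of the detector; the flux is the same everywhere on the sphere. This relation is known as the inverse-square law, and it applies to any point source of radiation in three-dimensional space.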
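Now we can define the absolute magnitude $M$ as the apparent magnitude a star would have if it were located at a distance of 10 pc. By Pogson's definition, two stars with apparent magnitudes $m_1$ and $m_2$ satisfy
$$\frac{F_2}{F_1} = 100^{(m_1 - m_2)/5}, \qquad m_1 - m_2 = -2.5\log_{10}\left(\frac{F_1}{F_2}\right),$$
where $F_1$ and $F_2$ are their radiant fluxes.
There are two important cases to consider.
First, consider two stars of the same luminosity but at different distances.
If we let one star sit at the standard distance of 10 pc, with flux $F_{10}$ and magnitude $M$, and the other at a distance $d$, with flux $F$ and magnitude $m$, then
$$100^{(m - M)/5} = \frac{F_{10}}{F} = \left(\frac{d}{10\ \mathrm{pc}}\right)^2,$$
where the second equality follows from the inverse-square law.
Combining these two equations and solving for $d$, we have
$$d = 10^{(m - M + 5)/5}\ \mathrm{pc}, \qquad m - M = 5\log_{10}\left(\frac{d}{10\ \mathrm{pc}}\right).$$
The quantity $m - M$ is known as the distance modulus.
The second case is when we have two stars at the same distance but with different luminosities.
If we let one of the stars be the Sun with absolute magnitude $M_\odot$ and luminosity $L_\odot$, and the other have absolute magnitude $M$ and luminosity $L$, we can compare them directly.
We can express the luminosities in terms of their radiant fluxes at a standard distance of 10 pc, $F_{10} = L/[4\pi(10\ \mathrm{pc})^2]$, so the ratio of fluxes equals the ratio of luminosities.
Using the flux-ratio form of the magnitude relation, we have
$$100^{(M - M_\odot)/5} = \frac{L_\odot}{L}, \qquad M = M_\odot - 2.5\log_{10}\left(\frac{L}{L_\odot}\right).$$
Similarly, we have
$$m = M_\odot - 2.5\log_{10}\left(\frac{F}{F_{10,\odot}}\right),$$
where $F_{10,\odot}$ is the flux the Sun would produce at a distance of 10 pc.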
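The magnitude relations are easy to get wrong by a sign, so a short numerical sketch may help. It assumes the standard Pogson relations $F_2/F_1 = 100^{(m_1 - m_2)/5}$ and $m - M = 5\log_{10}(d / 10\ \mathrm{pc})$; the function names and the sample magnitudes are illustrative.

```python
import math


def flux_ratio(m1, m2):
    """Flux ratio F2/F1 of two stars with apparent magnitudes m1 and m2."""
    return 100.0 ** ((m1 - m2) / 5.0)


def distance_from_modulus(m, abs_mag):
    """Distance in parsecs from apparent magnitude m and absolute magnitude M."""
    return 10.0 ** ((m - abs_mag + 5.0) / 5.0)


def absolute_magnitude(m, d_pc):
    """Absolute magnitude from apparent magnitude and distance in parsecs."""
    return m - 5.0 * math.log10(d_pc / 10.0)


print(flux_ratio(6.0, 1.0))              # sixth- vs first-magnitude star
print(flux_ratio(1.0, 0.0))              # one magnitude step
print(distance_from_modulus(5.0, 0.0))   # distance modulus of 5
print(absolute_magnitude(5.0, 100.0))    # inverse of the line above
```

Note that larger magnitudes mean fainter stars, so `flux_ratio(6.0, 1.0)` is the factor by which the first-magnitude star outshines the sixth-magnitude one.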
Light and Wave Phenomena
Newton, as he stated in Opticks, believed that light was made up of tiny particles called "corpuscles." One justification he gave was that shadows had sharp edges, which would be difficult to explain if light were a wave. This theory was later challenged by the wave theory of light, which was supported by experiments such as Thomas Young's double-slit experiment in 1801. We now know that light is an excitation of a quantized electromagnetic field, and it exhibits both wave-like and particle-like properties. The wave-like properties come from oscillatory terms in field operators, while the particle-like properties originate from creation and annihilation operators. Anyways, we will briefly review how humans came to understand the wave nature of light.
Initially, light was thought to travel instantaneously, as no delay could be detected by the naked eye.
In 1676, Ole Rømer made the first quantitative estimate of the speed of light by observing the eclipses of Jupiter's moons.
He noticed that the observed times of the eclipses varied depending on the distance between Earth and Jupiter.
By analyzing these variations, he estimated the speed of light to be approximately $2.2\times10^8$ m/s.
In modern times, the speed of light in a vacuum is a defining constant in the International System of Units (SI).
It is exactly $c = 299{,}792{,}458$ m/s.
Advancing further, Christiaan Huygens (1629–1695) proposed the wave theory of light in the late 17th century.
He suggested that light propagates as a wave, similar to sound or water waves.
As such, light waves were given the same mathematical description, with quantities like wavelength $\lambda$ and frequency $\nu$ related by $c = \lambda\nu$.
Huygens' principle states that every point on a wavefront can be considered a source of secondary wavelets, which spread out in all directions at the same speed as the wave itself. The new wavefront is then the envelope of these secondary wavelets.
One demonstration of the wave nature of light came from Thomas Young's double-slit experiment in 1801. In this experiment, light from a single source passes through two pinholes (nowadays we use slits) and creates an interference pattern on a screen behind the slits. The pattern consists of alternating bright and dark fringes, which can be explained by the constructive and destructive interference of the light waves emanating from the two slits.
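If the distance between the slits is $d$, light reaching a distant screen at an angle $\theta$ from the central axis travels along two paths that differ in length by $d\sin\theta$. Constructive interference (bright fringes) occurs when the path difference is a whole number of wavelengths,
$$d\sin\theta = n\lambda, \qquad n = 0, 1, 2, \dots,$$
whereas destructive interference (dark fringes) occurs when the path difference is an odd multiple of half the wavelength, i.e.,
$$d\sin\theta = \left(n - \tfrac{1}{2}\right)\lambda, \qquad n = 1, 2, 3, \dots$$
This can also be extended to multiple slits, which is the basis for diffraction gratings used in spectroscopy.
In a diffraction grating with adjacent slits separated by a distance $d$, the bright fringes appear at angles satisfying
$$d\sin\theta = n\lambda,$$
where $n = 0, 1, 2, \dots$ is the order of the spectrum. Because the angle depends on wavelength, the grating spreads light into its component colors.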
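As an illustration of the interference condition, the sketch below lists the angles of the bright fringes for a given slit spacing, assuming the standard relation $d\sin\theta = n\lambda$. The function name, the wavelength, and the slit spacing are illustrative choices.

```python
import math


def maxima_angles_deg(wavelength_m, slit_spacing_m):
    """Return (order, angle in degrees) for every order with n * lambda / d <= 1."""
    angles = []
    n = 0
    while n * wavelength_m / slit_spacing_m <= 1.0:
        theta = math.asin(n * wavelength_m / slit_spacing_m)
        angles.append((n, math.degrees(theta)))
        n += 1
    return angles


# Green light (500 nm) on a grating with 1.75 micrometer slit spacing
for order, angle in maxima_angles_deg(500e-9, 1.75e-6):
    print(f"order {order}: {angle:.2f} degrees")
```

Only a finite number of orders exist because $\sin\theta$ cannot exceed 1; a finer grating (smaller $d$) spreads the orders to larger angles but supports fewer of them.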
Rayleigh Criterion
We return to the discussion of telescopes. In addition to being able to focus light to a point, a telescope must also be able to resolve two closely spaced objects. By "resolve," we mean being able to distinguish the two objects as separate entities, rather than a single blurred object, especially when they are very close together in the sky. The ability to resolve two objects is limited by diffraction, which causes light waves to spread out as they pass through an aperture, such as the lens or mirror of a telescope.
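To explain how single-slit diffraction occurs, consider a slit of width $D$ illuminated by light of wavelength $\lambda$.
Suppose one ray travels straight through the center of the slit to a point on the screen at an angle $\theta$, while a second ray travels from the edge of the slit to the same point. Their path lengths differ by $(D/2)\sin\theta$, and the two rays cancel when this equals $\lambda/2$, that is, when $\sin\theta = \lambda/D$.
We can repeat this argument for rays originating from other points within the slit, leading to the general condition for destructive interference:
$$\sin\theta = m\frac{\lambda}{D}, \qquad m = 1, 2, 3, \dots$$
Computer-generated Airy pattern created by a circular aperture. The central bright region is known as the Airy disk, surrounded by concentric rings of decreasing brightness. The first dark ring occurs at an angle $\theta \approx 1.22\,\lambda/D$.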
If we have a circular aperture (like in most telescopes), by symmetry, the diffraction pattern will be circularly symmetric.
To analytically derive the diffraction pattern, we will have to double integrate over the circular aperture.
This will come later, but for now we just need to know that Sir George Biddell Airy (1801–1892) worked out the mathematics in 1835.
It is for this reason that the central bright region of the diffraction pattern is known as the Airy disk.
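The equation is similar to the single-slit case, $\sin\theta = m\lambda/D$, with $D$ now the diameter of the aperture and with non-integer values of $m$. The table lists $m$ and the intensity relative to the central maximum, $I/I_0$, for the first few rings:
| Ring | $m$ | $I/I_0$ |
|---|---|---|
| Central maximum | 0.000 | 1.00000 |
| First minimum | 1.220 | 0.00000 |
| Second maximum | 1.635 | 0.01750 |
| Second minimum | 2.233 | 0.00000 |
| Third maximum | 2.679 | 0.00416 |
| Third minimum | 3.238 | 0.00000 |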
(Table taken from Carroll & Ostlie, Table 6.1.)
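Now the key point is this: suppose we have two point sources of light (e.g., two stars) separated by a small angle $\theta$ on the sky. Each produces its own Airy pattern in the focal plane, and the two patterns overlap.
Different superimposed Airy patterns. The leftmost pattern shows two Airy disks that are well separated, allowing for clear resolution of the two sources. The middle pattern shows two Airy disks that are closer together, with their first dark rings overlapping, making it more challenging to resolve them. The rightmost pattern shows two Airy disks that are so close that they merge into a single blurred pattern, making it impossible to distinguish the two sources.
Can we quantify when two point sources can be resolved?
Consider when two sources are close together such that they are not clearly separated.
This means that the angular separation $\theta$ is comparable to the angular size of the Airy disk. By convention, the two sources are said to be just resolved when the central maximum of one pattern falls on the first minimum of the other, that is, when
$$\theta_{\min} = 1.22\frac{\lambda}{D}.$$
This is known as the Rayleigh criterion for resolution, named after Lord Rayleigh (John William Strutt, 1842–1919), who formulated it in 1879.
Notice that the minimum resolvable angle $\theta_{\min}$ is proportional to the wavelength $\lambda$ and inversely proportional to the aperture diameter $D$: a larger telescope, or observation at a shorter wavelength, resolves finer detail.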
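For a sense of scale, the following sketch evaluates the Rayleigh criterion $\theta_{\min} = 1.22\,\lambda/D$ in arcseconds. The function name and the example apertures (a 0.1 m amateur telescope and a 2.4 m mirror, observed at 550 nm) are illustrative assumptions.

```python
import math

ARCSEC_PER_RADIAN = 180.0 * 3600.0 / math.pi  # about 206265


def rayleigh_limit_arcsec(wavelength_m, aperture_m):
    """Diffraction-limited resolution, theta_min = 1.22 * lambda / D, in arcseconds."""
    return 1.22 * wavelength_m / aperture_m * ARCSEC_PER_RADIAN


for diameter in (0.1, 2.4):
    limit = rayleigh_limit_arcsec(550e-9, diameter)
    print(f"D = {diameter:4.1f} m: theta_min = {limit:.4f} arcsec")
```

These are theoretical limits for a perfect circular aperture; ground-based telescopes are usually limited by atmospheric seeing well before they reach them.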
Real-life observations do not always achieve the theoretical limit set by the Rayleigh criterion. This is due to many factors, one of which is the turbulence in Earth's atmosphere, which causes the light from celestial objects to be distorted as it passes through the atmosphere. This effect is known as astronomical seeing. To mitigate this, astronomers use techniques such as adaptive optics, which involve using deformable mirrors to correct for atmospheric distortions in real-time. Additionally, space-based telescopes, such as the Hubble Space Telescope, avoid atmospheric effects altogether by operating above Earth's atmosphere.
Also, as we know, the index of refraction depends on the wavelength of light, leading to chromatic aberration in lenses. This means that different colors of light are focused at different points, causing blurring and color fringing in images. To reduce chromatic aberration, we can introduce correcting lenses made of different types of glass with varying dispersion properties.
Electrodynamics and the EM Spectrum
Prior to the 19th century, there were three interesting phenomena known to physicists: electricity, magnetism, and light. It was not until the work of James Clerk Maxwell (1831–1879) that these phenomena were unified into a single theory known as electromagnetism. Maxwell's equations, published in the 1860s, describe how electric and magnetic fields interact and propagate through space.
As we have seen multiple times, Maxwell's equations predict the existence of electromagnetic waves that propagate at the speed of light. Heinrich Hertz (1857–1894) experimentally confirmed the existence of these waves in the late 1880s. He produced and detected radio waves in the laboratory, demonstrating that they exhibited the same properties as light, such as reflection, refraction, and polarization. Unfortunately, this work was done about eight years after Maxwell's death in 1879, so Maxwell did not live to see his predictions confirmed.
John Henry Poynting (1852–1914) further developed the theory of electromagnetism by describing the energy flow in electromagnetic fields.
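He introduced the Poynting vector
$$\mathbf{S} = \frac{1}{\mu_0}\mathbf{E}\times\mathbf{B},$$
equal to the energy per unit area per unit time (power per unit area) carried by an electromagnetic wave.
Its time average has magnitude
$$\langle S\rangle = \frac{1}{2\mu_0}E_0 B_0,$$
where $E_0$ and $B_0$ are the amplitudes of the electric and magnetic fields and $\mu_0$ is the permeability of free space. Because electromagnetic waves carry momentum as well as energy, they exert a radiation pressure on any surface they strike. For light at normal incidence,
$$P_{\mathrm{rad}} = \frac{\langle S\rangle}{c}\ \ \text{(absorption)}, \qquad P_{\mathrm{rad}} = \frac{2\langle S\rangle}{c}\ \ \text{(reflection)},$$
where $c$ is the speed of light.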
This effect is extremely small for everyday light intensities, but it can be significant in astrophysical contexts, such as the pressure exerted by sunlight on comet tails or the concept of solar sails for spacecraft propulsion. When we study early main-sequence stars, accelerating particles, and other high-energy astrophysical phenomena, we will see that radiation pressure can be a dominant force.
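To see how small radiation pressure is for everyday intensities, the sketch below evaluates the standard normal-incidence results $P = \langle S\rangle / c$ for a perfectly absorbing surface and $P = 2\langle S\rangle / c$ for a perfectly reflecting one. The solar irradiance at Earth (about 1361 W/m²) and the sail area are assumed, illustrative values.

```python
C = 299_792_458.0        # speed of light, m/s (exact by definition)
SOLAR_CONSTANT = 1361.0  # approximate solar irradiance at 1 AU, W/m^2


def radiation_pressure(flux_w_m2, reflective=False):
    """Radiation pressure in pascals for light at normal incidence."""
    factor = 2.0 if reflective else 1.0
    return factor * flux_w_m2 / C


p_absorb = radiation_pressure(SOLAR_CONSTANT)
p_reflect = radiation_pressure(SOLAR_CONSTANT, reflective=True)
sail_area = 100.0  # hypothetical solar sail area, m^2

print(f"Absorbing surface:  {p_absorb:.3e} Pa")
print(f"Reflecting surface: {p_reflect:.3e} Pa")
print(f"Force on a {sail_area:.0f} m^2 reflecting sail: {p_reflect * sail_area:.3e} N")
```

The resulting pressure is a few micropascals, roughly eleven orders of magnitude below atmospheric pressure, yet it acts continuously, which is what makes solar sails viable in principle.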
Radiation and the Birth of Quantum Mechanics
As previously discussed, Maxwell's equations predict the existence of electromagnetic waves that propagate at the speed of light. These waves can have a wide range of frequencies and wavelengths, forming the electromagnetic spectrum.
When an object is heated, the average kinetic energy of its atoms and molecules increases, causing them to vibrate more vigorously. The vibration of charged particles (such as electrons) produces electromagnetic radiation. The spectrum of this radiation depends on the temperature of the object. This phenomenon was first studied by Thomas Wedgwood (1771–1805) in the late 18th century, who observed that heated objects emit light.
A perfect blackbody is an idealized object that absorbs all incident radiation, regardless of frequency or angle of incidence.
When a blackbody is heated, it emits radiation with a characteristic spectrum that depends only on its temperature.
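To mathematically describe this phenomenon, German physicist Wilhelm Wien (1864–1928) proposed Wien's displacement law in 1893, which states that the wavelength $\lambda_{\max}$ at which a blackbody emits most intensely is inversely proportional to its temperature $T$:
$$\lambda_{\max}T \approx 2.898\times10^{-3}\ \mathrm{m\,K},$$
where $T$ is measured in kelvins. The total power radiated by a blackbody is given by the Stefan-Boltzmann law,
$$L = A\sigma T^4,$$
where $A$ is the surface area of the blackbody and $\sigma \approx 5.670\times10^{-8}\ \mathrm{W\,m^{-2}\,K^{-4}}$ is the Stefan-Boltzmann constant. For a spherical star of radius $R$, this defines the effective temperature $T_e$ through $L = 4\pi R^2\sigma T_e^4$.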
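Both blackbody relations can be tried out on the Sun. The sketch below assumes the standard forms $\lambda_{\max} T = b$ and $L = 4\pi R^2 \sigma T^4$; the solar radius and effective temperature are approximate reference values supplied for illustration, and the function names are arbitrary.

```python
import math

WIEN_B = 2.897771955e-3  # Wien displacement constant, m K
SIGMA = 5.670374419e-8   # Stefan-Boltzmann constant, W m^-2 K^-4

R_SUN = 6.957e8          # approximate solar radius, m
T_SUN = 5772.0           # approximate solar effective temperature, K


def peak_wavelength(temperature_k):
    """Wavelength of peak blackbody emission in meters (Wien's displacement law)."""
    return WIEN_B / temperature_k


def luminosity(radius_m, temperature_k):
    """Luminosity in watts of a spherical blackbody (Stefan-Boltzmann law)."""
    return 4.0 * math.pi * radius_m**2 * SIGMA * temperature_k**4


print(f"Solar peak wavelength: {peak_wavelength(T_SUN) * 1e9:.0f} nm")
print(f"Solar luminosity:      {luminosity(R_SUN, T_SUN):.3e} W")
```

The steep $T^4$ dependence means a star twice as hot as the Sun with the same radius would be sixteen times as luminous.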
By the end of the 19th century, many physicists and astronomers believed that they had an essentially complete understanding of the laws of physics. Newtonian mechanics and Maxwell's electromagnetism were well-established theories that seemed to explain all physical phenomena. It is worth noting how remarkably successful these theories were despite what we now know to be their limitations. This Newtonian paradigm was built upon centuries of scientific progress, some of which we have discussed in this section. We have seen the works of ancient thinkers like Aristotle and Ptolemy, the revolutionary ideas of Copernicus, Galileo, and Kepler during the Renaissance, and the groundbreaking contributions of Newton, Maxwell, and others from the 17th to the 19th century. Many physicists thought that only minor details remained to be discovered.
However, several experimental results in the late 19th and early 20th centuries challenged this Newtonian paradigm and led to the development of new theories.
One of the most significant challenges came from the study of blackbody radiation.
Physicists realized that there was no theoretical basis for the exact shape of the blackbody spectrum.
Lord Rayleigh (1842–1919) attempted to derive the spectrum using electromagnetic theory and classical statistical mechanics.
Specifically, he considered a cavity with perfectly reflecting walls, which contains standing electromagnetic waves.
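Let $L$ be the distance between the walls of the cavity.
As they are standing waves, the permitted wavelengths are
$$\lambda = \frac{2L}{n},$$
where $n = 1, 2, 3, \dots$. Assigning each standing-wave mode an average energy of $kT$, as classical statistical mechanics requires, gives the Rayleigh-Jeans law,
$$B_\lambda(T) \approx \frac{2ckT}{\lambda^4},$$
where $k$ is Boltzmann's constant. This agrees with experiment at long wavelengths but grows without bound as $\lambda \to 0$, a failure known as the ultraviolet catastrophe.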
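The resolution to this problem came from Max Planck (1858–1947) in 1900. Planck proposed that Wien's empirical formula for the spectrum could be slightly modified to fit the entire blackbody spectrum:
$$B_\lambda(T) = \frac{a/\lambda^5}{e^{b/\lambda T} - 1}.$$
This is the important part.
To identify the constants $a$ and $b$, Planck had to assume that the energy of a standing wave of frequency $\nu$ is restricted to integer multiples of a minimum energy, $E = nh\nu = nhc/\lambda$. The result is Planck's law,
$$B_\lambda(T) = \frac{2hc^2/\lambda^5}{e^{hc/\lambda kT} - 1},$$
where $h \approx 6.626\times10^{-34}\ \mathrm{J\,s}$ is Planck's constant.
Equivalently, as a function of frequency, it is
$$B_\nu(T) = \frac{2h\nu^3/c^2}{e^{h\nu/kT} - 1}.$$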
This law agreed perfectly with experimental results across all wavelengths and temperatures.
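Let's explore how this can be used for a star of radius $R$ and surface temperature $T$. Treating each patch of the surface as a blackbody, the energy per unit time with wavelengths between $\lambda$ and $\lambda + d\lambda$ emitted by a surface element $dA$ into a solid angle $d\Omega$, at an angle $\theta$ from the surface normal, is
$$B_\lambda(T)\,d\lambda\,dA\cos\theta\,d\Omega,$$
where $d\Omega = \sin\theta\,d\theta\,d\phi$.
The $\cos\theta$ factor accounts for the projected area of the surface element as seen from the direction of emission. Integrating over the outward hemisphere gives a factor of $\pi$, and integrating over the surface gives the area $4\pi R^2$.
Plugging in Planck's law, we have
$$L_\lambda\,d\lambda = 4\pi^2R^2B_\lambda\,d\lambda = \frac{8\pi^2R^2hc^2/\lambda^5}{e^{hc/\lambda kT} - 1}\,d\lambda,$$
known as the monochromatic luminosity of the star, for the wavelength is in an infinitesimal interval so the color does not change. The flux for monochromatic luminosity is known as the monochromatic flux, given via the inverse-square law as
$$F_\lambda\,d\lambda = \frac{L_\lambda}{4\pi r^2}\,d\lambda = \frac{2\pi hc^2/\lambda^5}{e^{hc/\lambda kT} - 1}\left(\frac{R}{r}\right)^2 d\lambda,$$
where $r$ is the distance to the star. Integrating the monochromatic luminosity over all wavelengths gives
$$L = \int_0^\infty L_\lambda\,d\lambda = 4\pi R^2\sigma T^4,$$
recovering the Stefan-Boltzmann law, where we have used the fact that $\int_0^\infty B_\lambda\,d\lambda = \sigma T^4/\pi$.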
This derivation shows how Planck's law leads to the Stefan-Boltzmann law and provides a theoretical basis for the effective temperature of stars.
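The link between Planck's law and the Stefan-Boltzmann law can also be checked numerically. The sketch below integrates the standard Planck function $B_\lambda(T) = (2hc^2/\lambda^5)/(e^{hc/\lambda kT} - 1)$ over wavelength with a simple trapezoid rule and compares the result with $\sigma T^4/\pi$. The integration limits (10 nm to 1 m), the step count, and the function names are arbitrary choices made for this illustration.

```python
import math

H = 6.62607015e-34      # Planck constant, J s
C = 299_792_458.0       # speed of light, m/s
K_B = 1.380649e-23      # Boltzmann constant, J/K
SIGMA = 5.670374419e-8  # Stefan-Boltzmann constant, W m^-2 K^-4


def planck_lambda(wavelength_m, temperature_k):
    """Planck function B_lambda(T) in W m^-3 sr^-1."""
    x = H * C / (wavelength_m * K_B * temperature_k)
    if x > 700.0:
        return 0.0  # exp(x) would overflow; the emission here is negligible
    return 2.0 * H * C**2 / wavelength_m**5 / math.expm1(x)


def integrated_planck(temperature_k, n=20_000):
    """Trapezoid-rule integral of B_lambda over wavelength, in log-wavelength steps."""
    lo, hi = math.log(1e-8), math.log(1.0)
    step = (hi - lo) / n
    total = 0.0
    for i in range(n + 1):
        lam = math.exp(lo + i * step)
        weight = 0.5 if i in (0, n) else 1.0
        # d(lambda) = lambda * d(ln lambda)
        total += weight * planck_lambda(lam, temperature_k) * lam * step
    return total


T = 5772.0
numeric = integrated_planck(T)
analytic = SIGMA * T**4 / math.pi
print(f"numeric integral: {numeric:.5e}")
print(f"sigma T^4 / pi:   {analytic:.5e}")
```

Integrating in $\ln\lambda$ rather than $\lambda$ keeps the sample points dense near the peak of the spectrum while still covering the long-wavelength tail.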
Color and Spectral Classification
With the understanding of blackbody radiation and Planck's law, astronomers could now relate the color of a star to its temperature. An important question was how to classify, measure, and compare the light from different stars.
The magnitude can be classified using the magnitude system we discussed earlier. However, this system does not account for the color of the star. One idea is to use filters to isolate specific wavelength ranges and measure the brightness in those ranges. For example, the UBV photometric system uses three filters: U (ultraviolet), B (blue), and V (visual). The Johnson-Cousins UBVRI system extends this to include R (red) and I (infrared) filters.
We can measure each star's apparent magnitudes physically by observing them through these filters. A filter is essentially a piece of glass or plastic that only allows light of certain wavelengths to pass through. Historically, astronomers used photographic plates with different emulsions to create filters that were sensitive to specific wavelength ranges. Nowadays, we use more advanced materials and technologies to create filters with precise transmission properties.
The difference in magnitudes between two filters is known as a color index.
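For instance, the color index $B - V$ is the difference between a star's blue and visual magnitudes. Because both magnitudes are measured at the same distance, $B - V = M_B - M_V$, and a smaller $B - V$ indicates a bluer star.
When we measure the apparent and absolute magnitudes over all wavelengths, the result is known as the bolometric magnitude.
However, since we cannot observe all wavelengths (for instance, much ultraviolet and infrared light is absorbed by Earth's atmosphere), we often use the visual magnitude $V$ together with a bolometric correction, $BC = m_{\mathrm{bol}} - V = M_{\mathrm{bol}} - M_V$.
Next, we want to relate the apparent magnitudes measured through different filters to the other physical properties of stars. A star's ultraviolet (U) magnitude, for instance, can be written as
$$U = -2.5\log_{10}\left(\int_0^\infty F_\lambda S_U\,d\lambda\right) + C_U,$$
where $F_\lambda$ is the monochromatic flux, $S_U(\lambda)$ is the sensitivity function of the U filter (the fraction of the flux detected at wavelength $\lambda$), and $C_U$ is a constant.
The value for $C_U$, like the corresponding constants for the other filters, is chosen so that the star Vega has a magnitude of zero through each filter.
The color index $U - B$ is then
$$U - B = -2.5\log_{10}\left(\frac{\int F_\lambda S_U\,d\lambda}{\int F_\lambda S_B\,d\lambda}\right) + C_{U-B},$$
where $C_{U-B} = C_U - C_B$.
If we plug in the monochromatic flux of a blackbody star, which is proportional to $(R/r)^2$, this factor appears in both the numerator and the denominator and cancels.
In other words, the distance $r$ and the radius $R$ drop out, and the color index depends only on the temperature of the blackbody.
Astronomers can use this relationship to estimate the temperature of a star based on its color index.
If we plot the color index $U - B$ against $B - V$ for a sample of stars, we obtain a color-color diagram. Real stars deviate from the ideal blackbody relation because their atmospheres absorb light at particular wavelengths.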
Summary and Next Steps
In this section, we have explored the historical development of our understanding of stellar parallax, the magnitude system, and the wave nature of light. We have seen how the work of astronomers and physicists over the centuries has led to the modern theories of electromagnetism and quantum mechanics. We have also discussed how these theories have been applied to understand the properties of stars, such as their temperatures and luminosities.
Once again, here are the key takeaways from this section, labeled by who discovered or developed them. The list is not strictly chronological; rather, it is grouped by the main contributors to each idea.
- Isaac Newton (1643–1727): Formulated the laws of motion and universal gravitation, laying the foundation for classical mechanics. Developed the reflecting telescope and made significant contributions to optics.
- Jean-Baptiste Chappe d’Auteroche (1728–1769): Made the first successful measurement of the distance to Venus during a transit of Venus across the Sun (1761).
- Friedrich Bessel (1784–1846): Made the first successful measurement of stellar parallax (61 Cygni, 1838).
- Norman Pogson (1829–1891): Proposed the logarithmic scale for stellar magnitudes, establishing the modern magnitude system (1856).
- Thomas Young (1773–1829): Conducted the double-slit experiment, demonstrating the wave nature of light (1801).
- Sir George Biddell Airy (1801–1892): Worked out the mathematics of diffraction through a circular aperture, leading to the concept of the Airy disk and the Rayleigh criterion for resolution (1835).
- Lord Rayleigh (1842–1919): Contributed to the Rayleigh criterion for resolution, and attempted to derive the blackbody radiation spectrum using classical physics (late 19th century).
- Ole Rømer (1644–1710): Made the first quantitative estimate of the speed of light by observing the eclipses of Jupiter's moons (1676).
- Christiaan Huygens (1629–1695): Proposed the wave theory of light and Huygens' principle, explaining the propagation of light waves (late 17th century).
- James Clerk Maxwell (1831–1879): Formulated Maxwell's equations, unifying electricity, magnetism, and optics into electromagnetism (1860s).
- Heinrich Hertz (1857–1894): Conducted experiments that confirmed the existence of electromagnetic waves, paving the way for the development of radio and wireless communication (1887).
- John Henry Poynting (1852–1914): Introduced the Poynting vector, describing the energy flow in electromagnetic fields (1884).
- Wilhelm Wien (1864–1928): Formulated Wien's displacement law, relating the temperature of a blackbody to the wavelength at which it emits radiation most intensely (1893).
- Josef Stefan (1835–1893): Empirically derived the Stefan-Boltzmann law, relating the total energy radiated by a blackbody to the fourth power of its temperature (1879).
- Ludwig Boltzmann (1844–1906): Theoretically derived the Stefan-Boltzmann law using thermodynamics and statistical mechanics (1884).
- Max Planck (1858–1947): Proposed the quantization of energy and formulated Planck's law of blackbody radiation, laying the groundwork for quantum mechanics (1900).
In the next part of the history of astronomy, we are moving into the 20th century, where we will explore the development of quantum mechanics, general relativity, and other modern theories that have shaped our understanding of the universe.